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Abstract 

We show that it is feasible to carry out exact Bayesian inference for non-Gaussian 
state space models using an adaptive Metropolis-Hastings sampling scheme with the 
likelihood approximated by the particle filter. Furthermore, an adaptive independent 
Metropolis Hastings sampler based on a mixture of normals proposal is computation- 
ally much more efficient than an adaptive random walk Metropolis proposal because 
the cost of constructing a good adaptive proposal is negligible compared to the cost 
of approximating the likelihood. Independent Metropolis-Hastings proposals are also 
attractive because they are easy to run in parallel on multiple processors. We also show 
that when the particle filter is used, the marginal likelihood of any model is obtained 
in an efficient and unbiased manner, making model comparison straightforward. 
Keywords: Auxiliary variables; Bayesian inference; Bridge sampling; Marginal likeli- 
hood. 
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1 Introduction 



We show that it is feasible to carry out exact Bayesian inference on the parameters of a general 
state space model by using the particle filter to approximate the likelihood and adaptive 
Metropolis-Hastings sampling to generate unknown parameters. The state space model can 
be quite general, but we assume that the observation equation can be evaluated analytically 
and that it is possible to generate from the state transition equation. Our methods are 



justified by the work of 



Andrieu et al 



(120101 ) who show that the approximate likelihood is 



the density of the observations conditional on the parameters and a set of auxiliary uniform 
variables, with the states integrated out. 

We consider a three compone nt version of the adaptive random walk Metropolis proposal 



of 
of 



Roberts and Rosenthall (120091) and the adaptive independent Metropolis Hastings proposal 



Giordani and KohnI (120101 ) which is based on a mixture of normals approximation to the 



posterior density. We show that the adaptive independent Metropolis Hastings proposal can 
be much more efficient than the adaptive random walk Metropolis proposal in terms of the 
computing time required to achieve a given level of accuracy for three reasons. The first reason 
is that it is important to construct efficient adaptive proposals because the appro ximate 



likelihood is stochastic and not a smooth function of the parameters (see 



Pitt 



20021 ). This 



means that small changes in the parameters can result in large changes in the approximate 
likelihood so that a sampling scheme such as a random walk that changes the parameters 
by small amounts to try and obtain adequate acceptance may not work well in this context. 
Second, it is worthwhile constructing efficient adaptive proposals because the cost of the 



adaptation steps is negligible compared to the cost of approximating the marginal likelihood 
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using the particle filter. The high cost of approximating the likelihood occurs as it is necessary 



to use a large number of particles to obtain an adequate approximation and it is necessary 
to run the particle filter thousands of times for simulation based inference. Third, it is much 
easier to run an adaptive independent Metropolis Hastings scheme in parallel on multiple 
processors than an adaptive random walk Metropolis scheme and such parallel processing 
can reduce computational time significantly for a given level of accuracy; in many of our 
examples the reduction is by a factor of five to thirty when running in parallel on eight 
processors. 

Our article also shows that when particle filtering is used, the marginal likelihood of any 
model can be obtained using bridge sampling or importance sampling in an efficient and 
unbiased manner making model comparison straightforward. The methodology is illustrated 
empirically using challenging models and data. 

Adaptive sampling methods are simulation methods for carrying out Bayesian inference 
that use previous iterates of the simulation to form proposal distributions, that is, the adap- 



tive samplers 



Haario et al. 



earn a bout the posterior distr i bution fror n previous iterates. See f or exa mple 



(1200 ih 



Atchade and Rosenthal! (120051 ) and 



Roberts and Rosentha. 



Giordani and Kohn 



20091) who 



torn who 



consider adaptive random walk Metropolis proposals and 
base their proposal on a mixture of normals. Adaptive sampling is particularly attractive 
when the particle filter is used to approximate the posterior density because it is difficult to 
form proposal densities by constructing approximations that require derivatives of the log 
likelihood. 



Particle filtering (also known as sequential Monte Carlo) was first proposed by 



Gordon et al 



(119931 1 for online filtering and prediction of nonlinear or non-Gaussian state space models. 



The auxiliary particle filter method was introduced by iPitt and ShephardI (119991 ) to improve 
the performance of the standard particle filter when the observation equation is informa- 
tive relative to the state equations, that is when the signal to noise ratio is moderate to 
high. T here is ari exteii si ye literature on onli n e filtering using the particle filter, see fo r 



example 



Kitagawal ( 



1996h. 



Andrieu and Doucet 



Liu and Chen ( 



(I2OO2I) 



19981 



Doucet et al. 



Fearnhead and Clifford (2003) and 



article considers only the standard partic 



iliary particle filter of 



Pitt and Shephard 



e filter of 



(2000) 



Doucet et al. 



Del Moral et al 



Gordon et al 



(Il999h . 



( 200 If ) 



mom . Our 



I993I ) and the generic aux- 



Pitt 



le lite rature on using the particle filter to learn about model parameters is more limited. 



(I2OO2I ) proposes the s mooth particle filte r to estimate the p arameters of a state space 



using maximum likelihood. 



StorvikI (120021 ) and 



Poison et al 



learning when sufficient statistics are available. 



Andrieu et al. 



for off line parameter learning using the particle fi lter. 



insightful discussion of the results of 



Andrieu et al. 



(120081) con sider online parameter 



(120 lOl) provide a fr amework 



lury and ShephardI (120081 ) give an 



(120101 ) and use single parameter random 



walk proposals to carry out off-line Bayesian inference. 



2 State space models 

Consider a state space model with observation equation p{yt\xt; 6) and state transition equa- 
tion p{xt\xt^i; 6), where yt and Xt are the observation and the state at tim e t and ^ is a vector 



of unk nown parameters. The distribution of the initial state is p{xo\9). See 



Cappe and Ryden 



(I2OO5I ) for a modern treatment of general state space models. The filtering equations for the 
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state space model (for t > 1) are flWest and Harrison 



1997 



, pp. 506-507) 



(la) 
(lb) 
(Ic) 



where i/i-t = {yi, . . . , yt}- Equations f|Taj) -f|Tc |) allow (in principle) for filtering for a given 6 
and for evaluating the likelihood of the observations y = yi^T, 



Pixt\yi:t-i;d) = J p{xt\xt-i;9)p{xt^i\yi.,t^i;9)dxt-i, 

I I a\ Piyt\xt;0)p{xt\yi:t~i;6) 

p{xt\yi:t;0) = — , 

p{yt\yi:t-i] 0) 

p{yt\yi:t-i] 0) = / p{yt\xt; 9)p{xt\yi.,t^i; 9)dxt. 



T-l 



piy\0) = Ylp{yt+i\yi. 



(2) 



where yi-Q is a null observation. If the likelihood p{y\9) can be computed, maximum like- 
lihood and MCMC methods can be used to carry out inference on the parameters 9, with 
the states integrated out. When both the observation and state transition equations are 
l inear and Gaussian the likelihood can be evaluated analytically using the Kalman filter 



(ICappe and Ryden 



20051 . pp. 141-143). More general state space m odels can also 



De esti- 



and 


Friihwirth-Schnatter and Waener 


Shephard and Pitt 


( 


1997 


). See Section 



Kim and ChibI fll998l ) 



Cappe and Ryden 



( I2OO5I ) for a review of Markov 



chain Monte Carlo methods applied to general state space models. In general, however, the 
integrals in equation s (fTall-(fTcD are computationally intractable and the standard particle 



filter is proposed by 



Gordon et al 



(119931 ) as a method for approximating them with the 



approximation becoming exact as the number of particles tends to infinity. Appendix |X] de- 



scribes the standard particle fi l ter an d its use in approximating the expressions in fllal) - 



We refer to 



Pitt and ShephardI (119991 ) for a description of the auxiliary particle filter and 



Pitt 



(j2002r i for the efficient computation of the likelihood based on the auxiliary particle filter. 



3 Adaptive sampling 

Suppose that 7r{6) is the target density from which we wish to generate a sample, but that 
it is computationally difficult to do so directly. One way of generating the sample is to use 
the Metropolis-Hastings method, which is now described. Suppose that given some initial 9o 
the j — 1 iterates 6*1, ... , 6'j_i have been generated. We then generate 6j from the proposal 
density qj{9; 0) which may also depend on some other value of 6 which we call 6. Let be 
the proposed value of 9j generated from qj{9\ Oj_i). Then we take 9j = 6^ with probability 



mm 



(3) 



and take 9j = Oj^i otherwise. If qj{0;6) does not depend on j, then under appropriate 
regularity conditions we can show that the sequence of iterates 9j converges to draws from 



the target density vr(6'). See iTierneyl ( 1l994l ) for details. 

In adaptive sampling the parameters ofqj{9] 9) are estimated from the iterates 6'i, . . . , 9j-2 
Under appropriate regularity conditio ns the sequence of iterates 9j,j > 1, converges to draws 



from t he ta rget distribution ti{9 ) . See 



mm and 



Giordani and Kohru (120101 ). 



Roberts and Rosenthal! (120071 ) 



Roberts and Rosenthal 



In our applications the target distribution is p{9\y) is not available in a known closed form. 
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but the s tandard and aux ili ary particle filters pr ovide unbiased estimates of the likelihood 



function (IDel Moral 



2004h . 



Andrieu et al. 



(120101 ) show that we can view the particle filter 



approximation to the likelihood p{y\0) as the density of y conditional on 6 and a set of 
auxiliary uniform variables u such that p{y\0) = f{y\9,u) and 



fiy\9,u)fiu\9)du = piy\9). 



(4) 



It follows that f{9\y) = p{9\y) so that a method that simulates from f{9, u\y) yields iterates 
from the correct posterior p{9\y). In particular an adaptive sampling method using the 
particle filter to estimate the likelihood can be considered as an auxiliary variable method 
to sample from the augmented target p{9,u\y) such that the joint proposal distribution for 
9 and u is q{9,u;9) = q{9;9)p{u\9) with u a vector of uniform variables. The acceptance 
probability (jS]) for an adaptive proposal qj{9,u; 9) becomes 



a{9j^i,Uj^i;9^,u^) = min< 1, 



p{y\9^,u';)p{9n qM-i;0') 
p{y\9j-i, Uj_i)p{9j_i) q.j{9f,9.j_i) 



(5) 



If the adaptive proposal is independent, i.e. qj{9,u; 9) = qj{9,u), then 



a{9j_i,Uj^i;9^,u^) = min< 1 



p{y\9'^,u^)p{9n qj{9 



j-i) 



p{y\9j_i,Uj_i)p{9j_i) qj{9^j) 



(6) 



The two adaptive sampling schemes studied in the paper are discussed in appendix O 
The following convergence results hold for the adaptive inde pendent Metropolis Hasting s 



sampling scheme described in appendix IC.2I (and more fully in 



Giordani and Kohru (1201 



when it is combined with the standard particle filter. They follow from Theorems 1 and 2 of 



Giordani and KohnI ((20101). Let G be the parameter space of 9. 



Theorem 1 Suppose that (i) p{yt\xt]d) < (j^t for t = 1,...,T, where (pt is functionally 
independent of 6 E Q and Xt and (ii) p{d)/g2{6) < C for any 9 E Q where C is a constant 
and the density g2{9) is the second component in the mixture proposal. Then, 

1. The iterates 9j of the adaptive independent Metropolis Hastings sampling scheme con- 
verge to a sample from p{9\y) in the sense that 



sup I PT{9j eA)- / p{9\ y)d9 \ 

Ac® J A 



as J oo. 



(7) 



for all measurable sets AofQ. 

2. Suppose that h{9) is a measurable function of 9 that is square integrable with respect to 
the density g2 . Then, almost surely. 



1 " r 
_J2h{9,)^ / h{9)pi9\y)d9 

^ .7=1 



as n ^ oo. 



Proof. 



T-l 



Pivl^) = n P(yt+^\y^--t'^ 9) <Yl<Pt because 



t=o 



t=i 



1 

Piyt\yi:t-i; 9) = —'^p{yt\xl;9) < 



by (fTSll. This shows tha t the a pproximate likelihood is bounded and the result now follows 



from 



Giordani and Kohn 



1(201(1) when we make the second component heavy tailed compared 



to the prior, as outlined in that paper. 



The theorem apphes to the stochastic volatihty, negative binomial and Poisson state space 
models discussed in section [Has well as to binary and binomial state space models. 

We can obtain a similar convergence result for the auxiliary particle filter if it is modified 
in a straightforward way to ensure that the importance weights are bounded. The proof is 
outlined in appendix [B] 

Theorem 2 Subject to the conditions of theorem U\ and the construction of the importance 
weights in appendixW\ the results of theoremUl also hold for the auxiliary particle filter. 

3.1 Adaptive sampling and parallel computation 

Our work uses parallel processing for adaptive sampling in two ways. Suppose J processors 
are available. The first approach applies to any sampling scheme. The likelihood is estimated 
for a given 6 on each of the processors using the particle filter with M particles and these 
estimates are then averaged to get an estimate of the likelihood based on JM particles. This 
approach is similar to, but faster, than using a single processor and makes it possible to 
estimate the likelihood using a large number of particles. 

The second approach applies mainly to independent Metropolis-Hastings sampling schemes 
and consists of iterating on the following three steps. Let 9'^ the current value of 9 gener- 
ated by the sampling scheme and qc{9) the current proposal density for 9. (a) For each of 
J processors generate K proposed values of 9, which we write as 9^^l,k = 1,...,K, and 
compute the corresponding logs of the ratios p{y\9^l)p{9^^l) / q{9^^l) . (b) After each K block 
of proposed values is generated for each processor, carry out Metropolis-Hastings selection 
of the JK proposed {9^^^} parameters using a single processor to obtain {9j^k} draws from 
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the chain. This is fast because drawing uniform variates is the only computation that is 
necessary, (c) Use the previous iterates and the 6j^k to update the proposal density qdO) and 



3.2 Estimating the marginal likelihood 

Marginal likelihoods are often used to compare two or more models. For a given model, let 
6 be the vector of model parameters, p{y\0) the likelihood of the observations y and p{d) the 
prior for 6. The marginal likelihood is 



It is often difficult to evaluate or estimate p{y) and appendix [D] briefly outlines how it can be 
estimated using bridge and importance sampling, with the computation carried out within 
the adaptive sampling so that a separate simulation run is unnecessary. 

4 Performance of the adaptive sampling schemes 

This section compares the performance of the two adaptive Metropolis-Hastings sampling 
schemes discussed in section |3]using both the standard particle fllter and the auxiliary particle 
fllter. The comparisons are carried out for several models using real data and illustrate 




(9) 



which in our case becomes 




(10) 
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the flexibility and wide applicability of the approach that combines particle filtering with 
adaptive sampling. The comparison is in terms of the acceptance rates of the Metropolis- 
Hastings methods, the inefficiency factors (IF) of the parameters, and an overall measure of 
effectiveness which compares the times taken by each combination of sampler and particle 
filter to obtain the same level of accuracy. We define the acceptance rate as the percentage 
of accepted values of each of the Metropolis-Hastings proposals. We define the inefficiency 
of the sampling scheme for a given parameter as the variance of the parameter estimate 
divided by its variance if the sampling scheme generates independent iterates. We estimate 
the inefficiency factor as IF = 1 + 2 Pj, where pj is the estimated autocorrelation at lag 
j. As a rule of thumb, the maximum number of lags L that we use is given by the lowest index 
j such that \f)j\ < 2/\fK where K is the sample size used to compute pj. The acceptance 
rate and the inefficiency factor do not take into account the time taken by a sampler. To 
obtain an overall measure of the effectiveness of a sampler, we define its equivalent computing 
time ECT = 10 x IF x t, where t is the time per iteration of the sampler. We interpret 
ECT as the time taken by the sampler to attain the same accuracy as that attained by 10 
independent draws of the same sampler. For two samplers a and 6, ECTa/ECTh is the ratio 
of times taken by them to achieve the same accuracy. 

We note that the time per iteration for a given sampling algorithm depends on how the 
algorithm is implemented, i.e. the language used, whether operations are vectorized, etc. 
Thus the implementation of the sampling scheme affects its ECT, but not the acceptance 
rates nor the inefficiencies. Implementation details are given in appendix [El 
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4.1 Example 1: Stochastic volatility model 

The first example considers the univariate stochastic volatility (SV) model 



Vt = Kt exp(xt/2)et, St ~ Ar(0, 1) 

(11) 

Xt = fl + (p{Xt-l - fl) + ariTIt, 77t~A/'(0, 1) 



where coTT{et,r]t) = p, Pr(i^j = 2.5) = u and Pr(/it = 1) = 1 — u, with u << 1. This is a 
state space model with a non-Gaussian observation equation and a Gaussian state transition 
equation for the latent volatihty Xt which follows a first order autoregressive model. The SV 
model allows for leverage because the errors in the observation and state transition equations 
can be correlated. The model also allows for outliers in the observation equation because 
the standard deviation of yt given Xt can be 2.5 its usual size when Kt = 2.5. To complete 
the model specification, we assume that all parameters are independent a priori with the 
following prior distributions: fi ~ A/'(0, 10), (p ~ rA/'(o,i)(0.9, 0.1), ~ J^(0.01, 0.01), 
Xo ~ A/'(0, 10), and p ~ TA/'(-i,i)(0, 10^) where N'{a,h) means a normal distribution with 
mean a and variance 6^, TM [c,d){0'i^) means a truncated normal with location a and scale 
h restricted to the interval (c, d) and XQ (a, h) is an inverse gamma distribution with shape 
pa rameter a, s cale p arameter h and mode h/{a + 1). We set u = 0.03 in the general model. 



Malik and Pitt 



ShephardI (20051) reviews SV models and a model of the form f|TT]) is estimated by 



(120081 ) by maximum likelihood using the smooth particle filter. 
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4.1.1 S&P 500 index 



We apply the SV model to the Standard and Poors (S&P) 500 data from 02/ Jan/1970 
to 14/Dec/1973 obtained from Yahoo Finance web sitqj. The data consists of T = 1 000 
observations. 

Table [T] presents the results of a Monte Carlo study using twelve replications with different 
random number seeds for the SV model with leverage using the first parallel computation 
method described in 13.11 Implementation details are given in appendix lE.il The table shows 
that the adaptive independent Metropolis-Hastings sampling scheme is at least seven times 
more efficient than the adaptive random walk Metropolis sampling scheme for both particle 
filters. 

Table 1: Medians and interquartile range (IR) of the acceptance rates and the inefficiencies 
(minimum, median and maximum) and ECT = IF x time for ten iterations over twelve 
replications of the stochastic volatility model with leverage to the S&P 500 data. 



Algorithm 


Ac. Rate 


Min. Incf. 


Median Inef. 


Max. Inef. 


Median ECT 


Median IR 


Median IR 


Median IR 


Median IR 


Median IR 




Standard Particle Filter 


RWM3C 
IMH-MN 


27.70 1.72 
60.32 2.68 


16.40 3.69 
2.28 0.30 


22.54 4.89 
2.65 0.53 


28.26 8.78 
3.45 2.52 


19.35 3.82 
2.29 0.48 




Auxiliary Particle Filter 


RWM3C 
IMH-MN 


27.76 1.76 
59.48 3.78 


18.14 3.72 
2.29 0.38 


24.28 6.33 
2.71 0.75 


30.06 9.76 
3.59 1.23 


31.19 8.43 
3.44 0.96 



We use importance sampling and bridge sampling to compute the marginal likelihoods of 

the four SV models: the model with no leverage effect (p = 0) and no outlier effect (u; = 0), 

the model that allows for leverage but not outliers, the model that allows for outliers but no 

leverage and the general model that allows for both outliers and leverage. Table [2] shows the 

logarithms of the marginal likelihoods of the four models for a single run of each algorithm. 

The differences between the two approaches are very small. In this example, and based on our 
^ http: / / au.finance.yahoo.coni/ q/hp?s= ' GSPC| 
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prior distributions, the SV model with leverage effects has the highest marginal likelihood. 



Table 2: Logarithms of the marginal likelihoods for four different SV models for the two 
particle filter algorithms computed using the adaptive independent Metropolis Hastings al- 
gorithm. BS and IS mean bridge sampling and importance sampling. 



Model 


Standard Particle Filter 


Auxiliary Particle Filter 


log(pss(y)) 


log (p/s(y)) 


log(pss(2/)) 


log(p/s(?/)) 


SV 


-1072.9 


-1072.9 


-1072.9 


-1072.9 


SV Lev. 


-1065.0 


-1065.0 


-1065.0 


-1065.0 


SV Out. 


-1076.6 


-1076.6 


-1076.5 


-1076.4 


SV Lev. Out. 


-1069.3 


-1069.3 


-1069.2 


-1069.3 



We also ran a simulation using the second parallel computing method described in section 
I3.lt using 10 000 iteration of the adaptive independent Metropolis Hastings samplers running 
on eight processors. Further implementation details are given in apppendix IE. II Table 
[3] summarizes the results. The table shows that the ECT of the adaptive random walk 
Metropolis algorithm is over 30 times larger than the ECT of the adaptive independent 
Metropolis Hastings algorithm because the latter takes advantage of the parallelization. 

Table 3: Medians and interquartile range (IR) of the acceptance rates and the inefficiencies 
(minimum, median and maximum) and ECT = IF x time for ten iterations over twelve 
replications of the stochastic volatility model with leverage to the S&P 500 data using the 
standard particle filter and parallel computing on eight processors. 



Algorithm 


Ac. Rate 
Median IR 


Min. Inef. 
Median IR 


Median Inef. 
Median IR 


Max. Inef. 
Median IR 


Median ECT 
Median IR 


RWM3C 
IMH-MN 


27.22 2.42 
58.15 1.90 


18.76 2.64 
2.56 0.40 


22.00 8.15 
2.98 0.54 


31.50 16.03 
4.38 2.55 


68.63 25.60 
1.39 0.27 



4.2 Example 2: Negative binomial model 



Pitt and Walkerl (120051 ) consider the following Poisson gamma model and give an MCMC 
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method to estimate it. 



?/i I t^t ~ V{LJt) , ujt I zt-i ~ g{u + zt-i, a + P) 
2t I cut ~ V{auJt) , iOt ~ Q{y, P), 



(12) 



where a > 0, /5 > and z/ > 0. We can integrate Ut out in f[T^ to obtain the negative- 
binomial model 



Vt I zt 



V a + p + 1 



Zt I zt^i ~ A/'S ( u + 



a + /3 
2a + /3 



with Zt ~ AT-B f z/, -^—^ . 



(13) 



We use the notation V{a) for a Poisson distribution with mean a, ^(a, 6) for a gamma 
distribution with shape parameter a, scale parameter h and mean a/h and N'B{r,p) is a 
negative binomial distribution with r number of successes, p the pr obabihty of succe s s, and 



Pitt and Walkeil (120051 ) 



with mean r(l — p)/p. One of the advantages of the approach of 
is that it is easy to obtain the marginal distribution of y^, which in this example is yt 
NB{v,p/{p + l)). 



4.2.1 Weekly firearm homicides in Cape Town 



his section fits the negative bi nomial model f|T3l) to the number of weekly firearm homicides 



(IMcDonald and Zucchini 



19971 . pp. 194-195) in Cape T own from January 1, 198 6 to Decem 



ber 31, 1991, (T = 313 observations) shown in figure [H 



Pitt and Walkerl fl2005[ ) also fit this 



model to the data. We also fitted to this data a state-space Poisson model with a random 
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50 100 150 200 250 

Week 



Figure 1: Number of weekly firearm homicides in Cape Town from January 1, 1986 to 
December 31, 1991. 

walk transition equation, 

Vt I /it ~P(exp(/it)) , nt = Ht~i + (J6t, St ~ A/'(0, 1). (14) 

because figured] suggests a possible nonstationarity in the data towards the end of the series. 

The prior distributions for both model are based on our empirical analysis of the data. We 
assume that the parameters are independent a priori with prior distributions v ~ 7iA/'(25), 
j3 ~ HA/'(25) and a ~ 7iA/'(400) for the negative binomial model, and cr^ ~ HAf{l) and 
/io ~ 7V(0.4324, 9) for the Poisson model, where 7YA/'(6^) stands for a half-normal distribution 
with scale b. 

Table m presents the results of a Monte Carlo study using twelve replications with different 
random number seeds for the negative binomial model. This simulation is based on the the 
first parallel computation method described in 13.11 The implementation details are given in 
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appendix IE. 2[ The table shows that the inefficiencies of the adaptive random walk Metropolis 



are at least seven times as large as those of the adaptive independent Metropolis Hastings. 



Table 4: Medians and interquartile range (IR) of the acceptance rates and the inefficiencies 
(minimum, median and maximum) and ECT = IF x time for twelve replications of the 
negative binomial model applied to the homicide data. 



Algorithm 


Ac. Rate 


Min. Incf. 


Median Inef. 


Max. Incf. 


Median ECT 


Median IR 


Median IR 


Median IR 


Median IR 


Median IR 




Standard Particle Filter 


RWM3C 
IMH-MN 


25.32 2.26 
62.90 2.68 


16.79 3.47 
2.24 0.77 


21.79 4.85 
2.65 0.83 


27.33 12.39 
3.82 3.35 


31.95 8.14 
4.18 1.33 




Auxiliary Particle Filter 


RWM3C 
IMH-MN 


24.58 1.78 
56.72 4.74 


19.03 5.81 
2.42 1.07 


23.58 5.61 
2.93 1.27 


33.81 10.07 
3.82 3.38 


37.00 8.76 
5.08 2.40 



Table [5] shows that the marginal likelihood of the negative binomial model is greater than 
that of the Poisson random walk model for the prior distributions chosen. A summary of the 



posterior distributions of the model paramete rs for the negative bino mial model (not shown) 



provides similar results to those presented in 



Pitt and Walker! (120051 ). 



Table 5: Logarithms of the marginal likelihood estimates for the negative binomial and the 
Poisson models for the two particle filter algorithms computed using the adaptive independent 
Metropolis Hastings algorithm. BS and IS mean bridge sampling and importance sampling. 



Model 


Standard Particle Filter 


Auxiliary Particle Filter 


^og{pBs{y)) log(p/s(y)) 


\og{pBs{y)) log{pis{y)) 


N. Binomial 
Poisson 


-620.6781 -620.6582 
-625.3963 -625.3939 


-620.5979 -620.6767 
-625.4304 -625.4298 



We also ran a simulation using the second parallel computing method described in sec- 
tion 13. H using 10 000 iteration of the adaptive independent Metropolis Hastings samplers 
running on eight processors. Implementation details are the same as for the second simula- 
tion in section l4.1.1[ Table [6] summarizes the results and shows that the ECT of the adaptive 
random walk Metropolis algorithm is nearly 50 times larger than the ECT of the adaptive 
independent Metropolis Hastings algorithm. 
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Table 6: Medians and interquartile range (IR) of the acceptance rates and the inefficiencies 
(minimum, median and maximum) and ECT = IF x time for twelve replications of the 
negative binomial model applied to the homicides data using the standard particle ffiter and 
parallel computing on eight processors. 



Algorithm 


Ac. Rate 
Median IR 


Mill. Incf. 
Median IR 


Median Inef. 
Median IR 


Max. Inef. 
Median IR 


Median ECT 
Median IR 


RWM3C 
IMH-MN 


24.68 1.96 
52.52 12.64 


18.46 6.40 
2.98 1.05 


25.72 8.13 
3.37 1.69 


36.14 11.89 
4.41 2.64 


103.14 33.89 
2.04 1.01 



4.3 Example 3: Poisson model 

This section considers a state space model with a Poisson observation equation, dynamic 
level and slope equations as well as explanatory variables 



(15) 



yt ~ P(exp(a;j/3 + /^t + St)) 

lit = fit-i + o-t-i + SI{t = tint) + cret, Et ~ A/'(0, 1), 

at = at^i + Tit, 6~Ar(0,l), 
J 

St = {ttj cos {wjt) + 7j sin {oOjt)} , 

where Ut = '^irj/h so that St has period h. The variable I{t = tint) = 1 if t = tint and 
otherwise so the model allows for a change in level in the fit equation if 5 7^ 0. We assume that 
the parameters are independent a priori with the following prior distributions: /3 ~ N(0, v^^/), 
fio ~ N(7Io, ^l), ao ~ N(0, ^l), ~ HN(0, ^l,), ~ HN(0, <^2,), 5 ^ N(0, 1), a, ~ N(0, ^l), 
and ~ N(0, (^2)^ for j = 1, . . . , J. 



4.3.1 Killed or seriously injured children in Linz 

The ffist application of the Poisson model is the number of children aged 6-10 that were killed 
or seriously injured by motor vehicles in Linz, Austria, from 1987 to 2002, corresponding to 



T = 192 observations. The data is analyzed by lFriihwirth-Schnatter and Wagnerl (120061 ). We 
fit the Poisson model at (fT5|) to the data. The seasonal patt ern in our model uses a Fourier se- 



ries re presentation that differs from the state space model in 



Friihwirth-Schnatter and Wagner 



(120061 ). We also include the same exp 



of children living in Linz, as used by 



anatory variable Xt = logf^;,), where Zt i s the number 



Friihwirth-Schnatter and Wagnerl (120061 ). The coeffi- 



cient f3 at (1151) is set to 1 so there a multiplicative effect on the mean of the Poisson model. 
The hyperparameters in the prior are based on an empirical analysis of the data using the 



19481 ) transform. In particular, we set Hq = —8.3779 , if't = 1.5, 



Ascombe (lAnscombd . 

= '/'a = = 0.005, v^^a = 0.2, (^^2 = 0.002 and uj = 27r/12. An intervention parameter 
I{t = tint) is included in the model to capture a possible decrease in the level of the series 
due to a change in the law in Linz in October 1, 1994 (tint = 95) as can be seen in figure O 





1987 1989 1991 1993 1995 1997 1999 2001 2003 

Month 

Figure 2: Monthly counts of killed or injured children from 1987 to 2002 in Linz. 



Table [7] presents the results of a Monte Carlo study using twelve replications with different 
random number seeds for the Poisson model. This simulation is based on the the first parallel 
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computation method described in l3.1[ The implementation details are given in appendix IE.3[ 
The table shows that the inefficiencies of the adaptive random walk Metropolis are at least 
seven times as large as those of the adaptive independent Metropolis Hastings. 



Table 7: Medians and interquartile range (IR) of the acceptance rates and the inefficiencies 
(minimum, median and maximum) and ECT = IF x time for twelve replications of the level 
and trend state-space poisson model applied to the Linz data. 



Algorithm 


Ac. Rate 


Min. Inef. 


Median Inef. 


Max. Inef. 


Median ECT 


Median IR 


Median IR 


Median IR 


Median IR 


Median IR 




Standard Particle Filter 


RWM3C 
IMH-MN 


25.71 1.88 
40.90 5.23 


40.61 5.22 
4.15 0.88 


56.41 6.11 
6.04 2.56 


81.04 9.57 
15.90 29.70 


14.54 1.44 
1.55 0.66 




Auxiliary Particle Filter 


RWM3C 
IMH-MN 


27.12 1.77 
41.78 3.80 


40.38 7.52 
4.16 0.72 


55.94 5.27 
6.47 2.14 


80.31 22.59 
22.03 28.13 


21.76 2.09 
2.43 0.74 



We compared the eight models given in table [8] using marginal likelihood. All the mod- 
els with seasonal effects include five harmonics. The table shows that the simplest model 
is slightly b etter than the level and intervent i on mo del which is consistent with the results 



reported in 



Friihwirth-Schnatter and Wagner! (120061 ). However, the intervention parameter 



is clearly negative with high probability and two of the seasonal coefficients have high prob- 
ability of being different from zero (results not shown). The bridge and importance samplers 
give similar estimates of the marginal likelihoods. 

Table 8: Logarithms of the marginal likelihoods for different Poisson models for the two 
particle ffiter algorithms. BS and IS mean bridge sampling and importance sampling. 



Poisson model 


Standard Particle Filter 


Auxiliary Particle Filter 


log(pi3s(y)) 


log{pis{y)) 


log(pi3s(y)) 


log(p/s(y)) 


Level 


-320.984 


-320.987 


-320.995 


-320.995 


Level and trend 


-333.110 


-333.122 


-333.100 


-333.100 


Level and intervention 


-321.279 


-321.278 


-321.285 


-321.291 


Level, trend and intervention 


-333.867 


-333.866 


-333.861 


-333.857 


Level and seasonality 


-328.438 


-328.436 


-328.451 


-328.444 


Level, trend and seasonality 


-343.214 


-343.202 


-343.246 


-343.247 


Level, intervention and seasonality 


-328.632 


-328.640 


-328.662 


-328.657 


Level, trend, intervention and seasonality 


-341.253 


-341.246 


-341.267 


-341.264 
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We also ran a simulation using the second parallel computing method described in section 
13.11 with 20 000 iteration of the adaptive independent Metropolis Hastings sampler running 
on eight processors. Implementation details are in appendix IE.3I Table M summarizes the 
results and shows that the ECT of the adaptive random walk Metropolis algorithm is nearly 
50 times larger than the ECT of the adaptive independent Metropolis Hastings algorithm. 

Table 9: Medians and interquartile range (IR) of the acceptance rates and the inefficiencies 
(minimum, median and maximum) and ECT = IF x time for twelve replications of the level 
and trend state-space Poisson model applied to the Linz data using the standard particle 
filter and the parallel computing in eight processors. 



Algorithm 


Ac. Rate 
Median IR 


Mill. Incf. 
Median IR 


Median Inef. 
Median IR 


Max. Inef. 
Median IR 


Median ECT | 
Median IR 


RWM3C 
IMH-MN 


26.37 0.86 
40.07 4.41 


40.60 7.57 
4.54 0.81 


56.41 4.40 
6.86 2.99 


85.81 26.95 
27.26 38.01 


107.70 13.41 
1.90 0.81 



4.3.2 Sydney asthma data 

This example models the time series of daily counts of asthma presentations at the accident 
and emergency department of Campbelltown Hospital located in southwest metropolitan 
area of Sydney, figure [3] is a plot of the data, w hich has 1461 observations from January 
1, 1990 to Decemb er 31, 1994. 



Davis et al 



Davis et al. 



(120031 ) analyze this data using a Poisson model. 



(120031 ) argue that the peaks in the series can be lined up with the four terms 



in the school year with the break between the first and second terms occurring at vary- 
ing times because of the timing of the Easter vacation. They include only one harmonic, 
(a cos(27rt/365) + 7 sin(27rt/365)) to model the seasonal effect, and model the peaks by con- 
structing the explanatory variable 



t-T- 
100 



for z = 1, 2, 3, 4 and j = 1, . . . , 1461 



21 



where Tij is the start time for the j'th school term in year i and p{x) oc x°'~^{l — x)^~^ 
(a beta density), with parameter a = 2.5 and b = 5. There are sixteen such explanatory 
variables but their preliminary analysis only includes eight of them corresponding to terms 
1 and 2 across all four years. They also include the following explanatory variables: Sunday 
and Monday effects (dummy variables), maximum daily ozone, maximum daily N02 and 
humidity. We apply model f[T5]) to the asthma data, but without the intervention variable. 




1990 1991 1992 1993 1994 

Day 

Figure 3: Counts of asthma presentation in Campbelltown Hospital. 



i.e. taking 6 identically zero, and use the hyperparameters = 0.02, /Jg = 0.5093 , ip'j^ = 20, 
(fil = 0.001, (/3^2 = 1-5, V9^2 = 0.002, if'^ = (f"^ = 10, Uj = 27r/365, which are obtained through 
an empirical analysis of the data. Table [TU] presents the results of a Monte Carlo study 
using twelve replications with different random number seeds for the Poisson model with 
level, trend and seasonality, and the explanatory variables. This simulation is based on the 
first parallel computation method described in 13. H with the implementation details given in 
appendix IE. 4[ The table shows that the inefficiencies of the adaptive random walk Metropolis 
are at least twice as large as those of the adaptive independent Metropolis Hastings. 
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Table 10: Asthma data: Medians and interquartile ranges (IR) of acceptance rates and the 
inefficiencies (minimum, median and maximum) and ECT = IF x time for twelve replications 
of the level and slope state-space Poisson model. 



Algorithm 


Ac. Rate 


Min. Inef. 


Median Inef. 


Max. Inef. 


Median ECT 


Median IR 


Median IR 


Median IR 


Median IR 


Median IR 




Standard Particle Filter 


RWM3C 
IMH-MN 


25.59 1.09 
27.52 7.75 


68.46 7.00 
8.51 6.18 


88.50 2.81 
16.45 15.62 


119.74 16.62 
52.26 49.66 


148.59 5.18 
27.73 26.34 




General Auxiliary Particle Filter 


RWM3C 
IMH-MN 


26.52 3.45 
18.45 7.22 


73.96 8.06 
14.52 7.32 


87.41 1.64 
26.16 11.38 


114.99 14.29 
55.21 16.44 


219.95 4.26 
66.03 28.71 



Table [TT] uses marginal likelihood to compare the model with just the level fit in the 
transition equation to a model containing the level Ht and the trend at, with the simpler 
model preferred. 

Table 11: Logarithms of the marginal likelihood estimates for the two Poisson models es- 
timated using the two particle filters. BS and IS mean bridge sampling and importance 
sampling. 



Model 


Standard Particle Filter 


Auxiliary Particle Filter 


log{pBs{y)) \og{pis{y)) 


^og{pBs{y)) \og{pis{y)) 


Level and seasonality 
Level, trend and seasonality 


-2558.3 -2558.3 
-2577.3 -2577.3 


-2558.2 -2558.3 
-2577.2 -2577.2 
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Appendices 



A Standard particle filter 



This section outline s the standard Sampling-Importance-Resampling (SIR) particle ffiter of 



Gordon et al. 



(119931 ). We suppress the dependence on the fixed parameter 6 for notational 
convenience. Suppose that that we have samples x'l_i ~ p{xt^i\yi;t^i) for k = 1,...,M. 
The particle ffiter works by taking this sample, from the filtering density at time t — 1, 
and translating it into a sample from the filtering density at time t. The first step involves 
simply passing each of these samples through the transition density to obtain ~ p{xt\x'l_i), 
for k = 1, M, which produces samples which are approximately distributed from equation 
fllaj) . These samples {x^} are therefore samples from the prediction density (we denote ffitered 
samples as x and the corresponding predictive samples as x). To each of these samples, for 
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k = 1, M, we attach the following weights, u^, and corresponding masses, tt. 



t ' 



=p{yt\xt), 



(16) 



This collection | (x^, vr^) }^-^ is now a discrete approximation to the filtering density p{xt\yi;t)- 
Explicitly, we may write this approximation, in terms of Dirac-delta functions, 6{-), as. 



M 



(17) 



k=l 



We need to resample from this mass function to obtain an equally weighted sample. H owever, 



prior to doing this we may estimate the term (fTc|) unbiasedly (jPel Moral 
denominator at equation f|T6|) . 



200J) by the 



^ M I ^ 



k=l 



k=l 



We may also estimate any moments under the filtering density, say E[g{xt)\yi:t], in the Rao- 
Blackwellised form as, 

M 
k=l 

Typically these estimators are more efficient than using the resampled analogs. 

To produce an equally weighted sample from equation ( flTl) . we need only think about 
the sampling of the discrete univariate index k with mass for k = 1,...,M. This is a 
multinomial sample and is the equivalent of a weighted bootstrapo We then have a sample 



^Thc SIR filter is sometimes referred to as tlie bootstrap filter for this reason. 
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zj, ■■■,z^ of resampled indices. Having sampled in this manner, we can now associate our 
resampled points, which we call for k = 1, M, with the predictive points, 

x'l = xf for k = l, ...,M. 

The method now proceeds to the next time step in a similar fashion. 

We may replace the multinomial resampling (weighted bootstrap) procedure with a strat- 
ified sampling step instead. This does not affect the validity of the particle filter or the 
subsequent MCMC strategy that we pursue. 



B Proof of theorem 2 



This section briefly outlin es the conditions for theo rem [2] to hold and its proof. The generic 



auxiliary particle filter of 



Pitt and ShephardI (119991 ) uses an importance density of the form 



p{yt\z^; 9)p{xt\x^_{), where z^ is a suitable value of Xt- We replace the term p(yt\z^; 9) in 



the im portance density by e(f)t + (1 — ^)p{yt\zf', 0) where (^t is defined in theorem [H By 



Pitt 



(I2OO2I ). the term (fTcIl is estimated unbiasedly by 



p{yt\yi: 



1 ^ 

k=l 



M 



-y 



^e0t + (l-£)p(yt|zt^ 



k. . 



It follows that v{yt\y\:t-\\ 0) < (pf The rest of the proof is the same as that of Theorem [H 
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C Adaptive sampling schemes 

This appendix describes the two adaptive samphng schemes used in the paper. 



C.l Adaptive random walk Metropolis 



The adaptive random walk Metropohs proposal of 



Roberts and Rosenthal! (120091 ) is 



qj{6] ej_i) = LOij(f)d{0] Oj-i, ft^iSi) + uJ2j(f)d{0] Oj-i, i^2^2j) 



(19) 



where d is the dimension of 9 and </)d(6'; 6*, S) is a multivariate d dimensional normal density in 
9 with mean 9 and covariance matrix S. In ( |T9l) . uoij = 1 for j < Jq, with jo representing the 
initial iterations, uij = 0.05 for j > jo with uj2j = 1 — uiij] fi^i = O.l^ /c?, /t^ = 2.38^/d, Si is a 



constant covariance matrix, which is taken as the identity matrix by 



Roberts and Rosenthal 



(120091 ) but can be based on the Laplace approximation or some other estimate. The matrix 



is the sample covariance matrix of the first j — 1 iterates. The scalar ki is meant to achieve 
a high ac ceptance rate by mo ving the sampler locally, while the scalar k,2 is considered to be 



optimal (IRoberts et al. 



19971 ) for a random walk proposal when the target is a multivariate 
normal. We note that the acceptance probability ([3]) for the adaptive random walk Metropolis 
simplifies to 



^^^^^ Piy\e',,u^,)pm 

' p{y\9j_i,Uj.i)p{9j, 



(20) 



We refine the two component random walk Metropolis proposal in ([T9|) by adding a third 
component with Ti^j = and with K3 = 25 ^ ki, K2- We take u^j = if j < Jq, uj^j = 0.05 
for j > jo and uj2j = 1 —ujij — i^3j- We refer to this proposal as the three component adaptive 
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random walk. The purpose of the heavier tailed third component is to allow the sampler to 
explore the state space more effectively by making it easier to leave local modes. 



C.2 A proposal density based on a mixture of normals 



he p roposal density of the adaptive independent Metropolis-Hastings approach of 



Giordani and Kohn 



(120101 ) is a mixture with four terms of the form 



^i(^) = ^^^kjgkiOlXkj) tOkj > 0, for = 1, . . . ,4 and "^ujkj = 1 , (21) 

fe=l k=l 

with Xkj the parameter vector for the density gkj{d:,\kj)- The sampling scheme is run in 
two stages, which are described below. Throughout each stage, the parameters in the first 
two terms are kept fixed. The first term gi{9\\ij) is an estimate of the target density and 
the second term g2{d\^2j) is a heavy tailed version of gi{9\\ij). The third term g^[9\\-i,j) is 
an estimate of the target that is updated or adapted as the simulation progresses and the 
fourth term g/^{9\\ij) is a heavy tailed version of the third term. In the first stage gij{6; Xij) 
is a Gaussian density constructed from a preliminary run, of the three component adaptive 
random walk. Throughout, g2iO\X2j) has the same component means and probabilities as 
gi{6\Xij), but its component covariance matrices are ten times those of gi{6\Xij). The term 
93{(^\^3j) is a mixture of normals and g4^{6\X4^j) is also a mixture of normals obtained by 
taking its component probabilities and means equal to those of g3{6\X3j), and its component 
covariance matrices equal to 20 times those of g^i^OlX^j). The first stage begins by using 
gi{0\Xij) and g2{0\X2j) only with, for example, uiij = 0.8 and uj2j = 0.2, until there is a 
sufficiently large number of iterates to form g^lOlX^j). After that we set uij = 0.15,ci;2j = 
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0.05, uj^j = 0.7 and u^j = 0.1. We begin with a single normal density for g^{6\X3j) and as 
the simulation progresses we add more components up to a maximum of six according to a 
schedule that depends on the ratio of the number of accepted draws to the dimension of 6. 

In the second stage, gi{6\Xij) is set to the value of gsiOl^^j) at the end of the first stage 
and g2{d\X2j) and 5'4(6'|A4j) are constructed as described above. The heavy-tailed densities 
92{&\^2j) and g4^{9\X4^j) are included as a defensive strategy to get out of local modes and to 
explore the sample space of the target distribution more effectively. 

It is computationally too expensive to update gsiOlX^j) (and hence gi{9\Xij)) at every 
iteration so we update them according to a schedule that depends on the problem and the 
size of the parameter vector. 



D Marginal likelihood evaluation using bridge and im- 



portance sampling 



Suppose t hat q(6) is an approxiin ation to p{0\y) which can be evaluated explicitly. Bridge 
sampling (IMeng and Wong . 



19961 ) estimates the marginal likelihood as follows. Let 



where f/ is a positive constant. Let 



A = I t{e)q{e)p{e \ y)de 

Ai 



A 



p{y) 



Then, 

where Ai = / t{9)q{e)p{y \ e)p{e)de 
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Suppose the sequence of iterates {9^^\j = 1, . . . , M} is generated from the posterior density 
p{9\y) and a second sequence of iterates {9^''\ k = 1, . . . , M} is generated from q{9). Then 



^ 1 ^ ^ 1 ^ A 

j=i k=i ^ 

are estimates of A and and Pssiv) is the bridge samphng estimator of the marginal 
hkehhood p{y). 

In adaptive samphng, q{6) is the mixture of normals proposal. Although U can be any 
positive constant, it is more efficient if t/ is a reasonable estimate of p{y). One way to do 
so is to take U = p{y\0*)p{9*) / q{9*) ^ where 9* is the posterior mean of 9 obtained from the 
posterior simulation. 

An alternative method to estimate of the m arginal likelihood viy) is to use impo rtance 



sampling based on the proposal distribution q{9) (IGeweke 



1989 



Chen and Shao 



19971 ). That 



IS, 



1 ^ 

Pis{y) = -^Y. 



p{y\9^^^)p{9^''^) 



k=l 



q{9^>')) 



Since our proposal distributions have at least one heavy tailed component, the importance 
sampling ratios are likely to be bounded and well-behaved, as in the examples in this paper. 



E Implementation details and sampling schedules 



We coded most of the algorithms in MATLAB, with a small proportion of the code written 



le resampling 



using C/Mex files. For the particle filters, we also use a C/Mex file for t 
step using an efficient algorithm to draw from a general discrete distribution ([Walker 



19771 ) 
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available as a C function in the GNU Scientific Library fiGalassi et al.l . 120091 ) . We carried 
out the estimation on an SGI cluster with 42 compute nodes. Each of them is an SGI Altix 
XE320 with two Intel Xeon X5472 (quad core 3.0GHz) CPUs with at least 16GB memory. 
We ran parallel jobs using up to eight processors and MATLAB 2009. 

E.l Implementation details for the S&P 500 index data 

This section gives the implementation details for the first and second simulations in sec- 
tion 14.1.11 The first simulation uses a sampling rate of M = 3000 particles for each time 
period for each of eight processor, so each step of the particle filter uses 24 000 particles. 
The number of iterations of the adaptive samplers is 10 000 with the updates of the pro- 
posal distributions for the adaptive independent Metropolis-Hastings samplers performed at 
iterations 100, 200, 500, 1 000, 2 000, 3 000, 4 000, 5 000, 6 000 and 7 500. The adaptive 
independent Metropolis Hastings sampling scheme is initialized using 2000 iterations of the 
ARWM3C. The initial proposal for all the AIMH algorithms are based on a multivariate 
normal distribution estimated from these draws and the initial starting values are the sample 
means. 

The second simulation uses eight processors with block sizes for each processor of 15, 
25, 60, 125, 250, 375, 500, 625, 750 and 940, corresponding to 120, 200, 480, 1 000, 2 000, 
3 000, 4 000, 5 000, 6 000 and 7 520 proposed parameter values before each update of the 
proposal density. The proposal distribution is updated at the completion of each block. In 
this simulation M = 10 000 for the standard particle filter. 
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E.2 Implementation details for the weekly homicide data 

This section gives the implementation details for the first and second simulations in sec- 
tion 14.2. 1[ The sampling rate for the first simulation is M = 2 500 particles on each of 
eight parallel processors, so each step of the particle filter uses 24 000 particles. The number 
of iterations of the adaptive sampling algorithms is set to 10 000 with the updates of the 
proposal distributions for the adaptive independent Metropolis-Hastings samplers performed 
at iterations 100, 200, 500, 1 000, 2 000, 3 000, 4 000, 5 000, 6 000, and 7 500. The simulation 
is initialized as in lE.li 

The schedule for the second simulation is the same as in section lE.li 

E.3 Implementation details for the analysis of the Linz data 

This section gives the implementation details for the first and second simulations in sec- 
tion 14.3. 1[ The sampling rate for the first simulation is M = 4 500 particles in each of eight 
parallel processors. The number of iterations is 20 000 with updates of the proposal distribu- 
tion for the adaptive independent Metropolis-Hastings sampler performed at iterations 300, 
1 000, 3 000, 5 000, 10 000, and 15 000. The simulation is initialized as in EH 

For the second simulation the number of iterations of the adaptive samplers is 20 000 with 
the adaptive independent Metropolis-Hastings sampler running on eight processors with the 
block sizes for each processor 40, 125, 375, 625, 1250, and 1875, corresponding to 320, 1000, 
480, 3 000, 5 000, 10 000 and 15 000 proposed parameter values. The updates of the proposal 
distribution occur at the end of each block. In this simulation, M = 30 000 particles for the 
standard particle filter. 
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E.4 Implementation details for analysis of the asthma data 

This section gives the implementation details for the simulation in section H:.3.2[ The sampling 
rate for the simulation is M = 4 000 particles in each of eight parallel processes. The number 
of iterations is set to 50 000 with the updates of the proposal distributions for the adaptive 
independent Metropolis-Hastings samplers performed at iterations 1 000, 2 000, 3 000, 4 000, 
5 000, 7 000, 8 000, 10 000, 15 000, 20 000, 25 000, 30 000, and 40 000. 
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